- SEA Technical Memorandum #0401, ARC 6.02; General Archive Format
- Last updated: April 27, 1989
- Copyright 1989 by System Enhancement Associates, Inc.
-
-
-
- ARC 6.02
-
- General Archive Format
-
-
- The ARC file archive format, created by System Enhancement Associates in
- March of 1985, has previously been documented primarily by the ARC sources
- themselves. The purpose of this document is to provide a separate overview
- of the construction and format of an ARC format archive. It is not our
- intent to document the compression algorithms themselves here; those
- remain defined by the ARC sources.
-
-
- An ARC format archive consists of one or more archive entries where each
- entry begins with an "entry header". This header is typically followed by
- data that applies to the header. In the usual case of a header that
- identifies a compressed file, the data is the compressed file.
-
- There are two general categories of entry headers: compressed files and
- control information. Every header, without exception, begins with an entry
- header marker, which is a single byte with a value of 26 decimal (1A hex,
- "control Z"). This marker byte is immediately followed by a one-byte
- "header type code" that identifies the format and type of the header data
- which follows.
-
- Most entry headers will be for "standard compressed files". The header
- data that follows the marker and type code bytes will have the following
- format (offsets are counted from the first byte after the type code):
-
- Offset  Length  Description
- ------  ------  -----------
-    0      13    Null-terminated filename
-   13       4    Size of the compressed data, in bytes
-   17       2    Creation date, in MS-DOS format
-   19       2    Creation time, in MS-DOS format
-   21       2    Cyclic redundancy check (CRC) value
-   23       4    True length of uncompressed file
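
As an illustration, the standard header data can be unpacked in Python with the struct module. This is a minimal sketch, not SEA's code. It assumes the fields are packed back to back in the lengths listed above (13 + 4 + 2 + 2 + 2 + 4 = 27 bytes), that multi-byte values are unsigned and stored low byte first as on MS-DOS, and that the filename is ASCII. The names StandardHeader and parse_standard_header are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumed layout: 13-byte name, then little-endian unsigned fields of
# 4, 2, 2, 2 and 4 bytes, with no padding between them (27 bytes total).
STANDARD_HEADER = struct.Struct("<13sIHHHI")


class StandardHeader(NamedTuple):
    name: str
    compressed_size: int
    dos_date: int
    dos_time: int
    crc: int
    true_length: int


def parse_standard_header(data: bytes) -> StandardHeader:
    """Decode the header data that follows the marker and type code bytes."""
    raw_name, size, date, time, crc, length = STANDARD_HEADER.unpack(data)
    # The name is null-terminated; bytes after the first NUL are ignored.
    name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
    return StandardHeader(name, size, date, time, crc, length)
```

The date and time are returned as raw 16-bit values, so a caller that only needs sizes and names does not have to decode them.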
-
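
The date and time fields use the packed 16-bit MS-DOS directory format. The sketch below decodes them under the usual MS-DOS bit layout, which this memorandum does not spell out and which is therefore stated here as an assumption: the date holds year minus 1980 in bits 15-9, month in bits 8-5 and day in bits 4-0; the time holds hours in bits 15-11, minutes in bits 10-5 and seconds divided by two in bits 4-0. The function name decode_dos_datetime is hypothetical.

```python
import datetime


def decode_dos_datetime(dos_date: int, dos_time: int) -> datetime.datetime:
    """Convert packed MS-DOS date and time words to a datetime."""
    year = 1980 + ((dos_date >> 9) & 0x7F)   # bits 15-9: years since 1980
    month = (dos_date >> 5) & 0x0F           # bits 8-5
    day = dos_date & 0x1F                    # bits 4-0
    hour = (dos_time >> 11) & 0x1F           # bits 15-11
    minute = (dos_time >> 5) & 0x3F          # bits 10-5
    second = (dos_time & 0x1F) * 2           # bits 4-0, two-second units
    # datetime() raises ValueError for impossible values such as month 0.
    return datetime.datetime(year, month, day, hour, minute, second)
```

A date word of zero does not encode a valid calendar date, so callers may want to catch ValueError and fall back to a placeholder.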
-
- This is referred to in our documentation as a "standard header". In almost
- all cases an entry header is made to follow the format of a standard header
- as much as possible. At this time it is possible to treat any header that
- is encountered as if it were a standard header, with two exceptions:
-
- * A type one header is an obsolete form of the type two header
- (uncompressed file). It is four bytes shorter because it omits the
- final field, the true length of the uncompressed file. A type one
- header may be converted to a type two header on input by (a) reading
- four bytes less than the full header size, and then (b) setting the
- size of the uncompressed file equal to the size of the compressed data.
-
- * A type zero header marks the end of an archive, and has no header data.
- That is, an archive will end with an entry header marker byte followed
- by a zero byte.
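
The type one conversion described above can be done on the raw bytes, before any fields are decoded. This is a sketch under the assumption that the header fields are packed back to back, so the four-byte compressed size sits directly after the 13-byte filename and type one header data is 23 bytes long. The name upgrade_type_one is hypothetical.

```python
TYPE_ONE_DATA_SIZE = 23       # assumed: 27-byte standard header minus 4 bytes
SIZE_FIELD = slice(13, 17)    # assumed: compressed size follows the name


def upgrade_type_one(data: bytes) -> bytes:
    """Return type two header data built from type one header data."""
    if len(data) != TYPE_ONE_DATA_SIZE:
        raise ValueError("expected 23 bytes of type one header data")
    # Step (b): the uncompressed length equals the compressed size, so the
    # missing final field is a copy of the compressed size field.
    return data + data[SIZE_FIELD]
```

Because the field is copied byte for byte, this step does not depend on the byte order of the stored integers.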
-
-
- Thus, the process for scanning through an ARC format archive picking out
- entry headers may be summed up as follows:
-
- 1) Read one byte for the entry header marker. If it's not an entry
- header marker (26 decimal), then exit with an error condition.
-
- 2) Read one byte for the header type code.
-
- 3) If the entry type is zero, stop.
-
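- 4) If the entry type is one, read 23 bytes of header data (the standard
- header without its final four-byte length field). Then set
- uncompressed size equal to compressed size.
-
- 5) If the entry type is anything else, read 27 bytes of header data
- (the sum of the standard header field lengths).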
-
- 6) Do whatever you had in mind with the header data.
-
- 7) Perform a "relative seek" forward, skipping a number of bytes equal
- to the compressed data size. Return to step (1).
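
The steps above can be sketched as a Python generator. This is an illustration only, not the ARC sources. It assumes the header data is exactly the sum of the field lengths in the standard header table (27 bytes, or 23 bytes for a type one header), that integers are unsigned and stored low byte first, and that a missing end-of-archive entry is an error. The name scan_entries is hypothetical.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple

MARKER = 0x1A                          # 26 decimal, "control Z"
FULL = struct.Struct("<13sIHHHI")      # assumed 27-byte standard header
SHORT = struct.Struct("<13sIHHH")      # assumed 23-byte type one header


def scan_entries(f: BinaryIO) -> Iterator[Tuple[int, str, int, int]]:
    """Yield (type, name, compressed_size, true_length) for each entry."""
    while True:
        lead = f.read(2)                              # steps 1 and 2
        if len(lead) < 2 or lead[0] != MARKER:
            raise ValueError("missing entry header marker")
        entry_type = lead[1]
        if entry_type == 0:                           # step 3
            return
        layout = SHORT if entry_type == 1 else FULL   # steps 4 and 5
        data = f.read(layout.size)
        if len(data) < layout.size:
            raise ValueError("truncated entry header")
        fields = layout.unpack(data)
        name = fields[0].split(b"\0", 1)[0].decode("ascii", errors="replace")
        size = fields[1]
        length = size if entry_type == 1 else fields[5]
        yield entry_type, name, size, length          # step 6
        f.seek(size, io.SEEK_CUR)                     # step 7
```

The file must be opened in binary mode and be seekable. The relative seek runs when the caller asks for the next entry, so a caller that wants the compressed data should read it through a separate file handle or record the current position before continuing.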
-
-
-
- Header types twenty and up identify extended information, and are described
- in TM0402, "ARC 6.02; Extended Data". Standard compressed files are
- identified as header types one through nineteen, as follows:
-
- Type Compression method
- ---- ------------------
- 1 No compression (short header)
- 2 No compression (standard header)
- 3 Repeated Character Compression
- 4 RCC followed by Huffman
- 5 12-bit Lempel-Ziv
- 6 RCC followed by 12-bit Lempel-Ziv
- 7 RCC followed by 12-bit Lempel-Ziv, alternate hash function
- 8 RCC followed by variable 12-bit Lempel-Ziv with dynamic reset
- 9 Variable 13-bit Lempel-Ziv with dynamic reset (nonstandard)
- 10+ Reserved for future use
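
A reader of ARC archives will typically branch on the header type code. The sketch below encodes the ranges given in this memorandum (zero ends the archive, one through nineteen are standard compressed files, twenty and up are extended information) together with the method table above. The function and table names are hypothetical.

```python
COMPRESSION_METHODS = {
    1: "No compression (short header)",
    2: "No compression (standard header)",
    3: "Repeated Character Compression",
    4: "RCC followed by Huffman",
    5: "12-bit Lempel-Ziv",
    6: "RCC followed by 12-bit Lempel-Ziv",
    7: "RCC followed by 12-bit Lempel-Ziv, alternate hash function",
    8: "RCC followed by variable 12-bit Lempel-Ziv with dynamic reset",
    9: "Variable 13-bit Lempel-Ziv with dynamic reset (nonstandard)",
}


def classify_header_type(code: int) -> str:
    """Return the broad category of a one-byte header type code."""
    if not 0 <= code <= 255:
        raise ValueError("header type code is a single byte")
    if code == 0:
        return "end of archive"
    if code <= 19:
        return "standard compressed file"
    return "extended information"


def compression_method(code: int) -> str:
    """Describe the compression method for header types 1 through 19."""
    if not 1 <= code <= 19:
        raise ValueError("not a standard compressed file type")
    return COMPRESSION_METHODS.get(code, "Reserved for future use")
```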
-
-
- These compression methods all have common names associated with them, as
- follows:
-
- Type Common name
- ---- -----------
- 1 stored
- 2 Stored
- 3 Packed
- 4 Squeezed
- 5 crunched
- 6 crunched
- 7 crunched
- 8 Crunched
- 9 Deviant
- 10+ Other
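
For listing programs, the common names can be kept in a small lookup table. This sketch copies the names exactly as they appear in the table above, including the difference in capitalization between types; the names COMMON_NAMES and common_name are hypothetical.

```python
COMMON_NAMES = {
    1: "stored",
    2: "Stored",
    3: "Packed",
    4: "Squeezed",
    5: "crunched",
    6: "crunched",
    7: "crunched",
    8: "Crunched",
    9: "Deviant",
}


def common_name(code: int) -> str:
    """Return the common name for a compressed file header type."""
    if code < 1:
        raise ValueError("type zero marks the end of the archive")
    # Types ten and up are listed as "Other" in the table.
    return COMMON_NAMES.get(code, "Other")
```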
-
-
-
- Conclusion
- ==========
-
- We hope that anyone seeking to work with ARC format archives finds this
- information of use. If we've left out anything you require, please feel
- free to contact us. We can be reached by voice between 9 AM and 5 PM
- Eastern time at (201) 473-5153. You can also leave a message for us on our
- customer support bulletin board at (201) 473-1991. This is a five-line
- system that is available 24 hours a day at up to 2400 baud. We can also be
- reached by mail at:
-
- System Enhancement Associates, Inc.
- 21 New Street, Wayne NJ 07470
-